
Redefining Digital Health Interfaces with Large Language Models

Published 5 Oct 2023 in cs.CL | (2310.03560v3)

Abstract: Digital health tools have the potential to significantly improve the delivery of healthcare services. However, their adoption remains comparatively limited due, in part, to challenges surrounding usability and trust. Large language models (LLMs) have emerged as general-purpose models with the ability to process complex information and produce human-quality text, presenting a wealth of potential applications in healthcare. However, directly applying LLMs in clinical settings is not straightforward, as they are susceptible to providing inconsistent or nonsensical answers. We demonstrate how LLM-based systems can utilize external tools and provide a novel interface between clinicians and digital technologies. This enhances the utility and practical impact of digital healthcare tools and AI models while addressing current issues with using LLMs in clinical settings, such as hallucinations. We illustrate LLM-based interfaces with the example of cardiovascular disease risk prediction. We develop a new prognostic tool using automated machine learning and demonstrate how LLMs can provide a unique interface to both our model and existing risk scores, highlighting the benefit compared to traditional interfaces for digital tools.
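The tool-use idea in the abstract can be illustrated with a minimal sketch: the language model decides which external tool to call, and the number itself comes from that tool rather than from generated text. Everything here is an assumption made for illustration and is not taken from the paper: `Tool`, `toy_risk_score` (with made-up coefficients and no clinical meaning), and `choose_tool`, a keyword match standing in for the LLM's tool-selection step.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Tool:
    """A named external tool the interface can call (hypothetical structure)."""
    name: str
    description: str
    run: Callable[[dict], float]


def toy_risk_score(patient: dict) -> float:
    """Illustrative logistic score; coefficients are invented, not a clinical model."""
    z = (-7.0
         + 0.08 * patient["age"]
         + 0.02 * patient["systolic_bp"]
         + 0.7 * patient["smoker"])
    return 1.0 / (1.0 + math.exp(-z))


# Registry of available tools; a real system might also expose existing risk scores.
TOOLS: Dict[str, Tool] = {
    "toy_risk_score": Tool(
        name="toy_risk_score",
        description="Illustrative risk from age, systolic_bp, smoker.",
        run=toy_risk_score,
    ),
}


def choose_tool(question: str) -> Optional[str]:
    """Stand-in for the LLM's tool selection (assumption: simple keyword match)."""
    return "toy_risk_score" if "risk" in question.lower() else None


def answer(question: str, patient: dict) -> str:
    """Route the question to a tool; never invent a number without one."""
    tool_name = choose_tool(question)
    if tool_name is None:
        return "No suitable tool; deferring rather than guessing a number."
    risk = TOOLS[tool_name].run(patient)
    return f"{tool_name} estimates a risk of {risk:.1%} for this patient."


if __name__ == "__main__":
    patient = {"age": 60, "systolic_bp": 140, "smoker": 1}
    print(answer("What is this patient's cardiovascular risk?", patient))
    print(answer("What is the weather today?", patient))
```

The design point is that the numeric estimate is always produced by a deterministic tool, so the language layer only selects the tool and phrases the result, which is one way to limit hallucinated values.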

Electronic Health Record Usability Issues and Potential Contribution to Patient Harm. JAMA 319 (12), 1276–1278 (2018). 10.1001/jama.2018.1171 . (9) Shanafelt, T. D. et al. Relationship between clerical burden and characteristics of the electronic environment with physician burnout and professional satisfaction. Mayo Clin. Proc. 91 (7), 836–848 (2016). 10.1016/j.mayocp.2016.05.007 . (10) Gardner, R. L. et al. Physician stress and burnout: The impact of health information technology. J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019) . (11) Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. 
BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? 
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Müller-Riemenschneider, F. et al. Barriers to Routine Risk-Score Use for Healthy Primary Care Patients: Survey and Qualitative Study. Arch. Intern. Med. 170 (8), 719–724 (2010). 10.1001/archinternmed.2010.66 . (6) Abernethy, A. et al. The promise of digital health: Then, now, and the future. NAM perspect. (2022) . (7) Ratwani, R. M., Reider, J. & Singh, H. A Decade of Health Information Technology Usability Challenges and the Path Forward. JAMA 321 (8), 743–744 (2019). 10.1001/jama.2019.0161 . (8) Howe, J. L., Adams, K. T., Hettinger, A. Z. & Ratwani, R. M. Electronic Health Record Usability Issues and Potential Contribution to Patient Harm. JAMA 319 (12), 1276–1278 (2018). 10.1001/jama.2018.1171 . (9) Shanafelt, T. D. et al. Relationship between clerical burden and characteristics of the electronic environment with physician burnout and professional satisfaction. Mayo Clin. Proc. 91 (7), 836–848 (2016). 10.1016/j.mayocp.2016.05.007 . (10) Gardner, R. L. et al. Physician stress and burnout: The impact of health information technology. J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019) . (11) Bajwa, J., Munir, U., Nori, A. 
& Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. 
Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 
10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . 
(43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . 
(53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). 
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Abernethy, A. et al. The promise of digital health: Then, now, and the future. NAM perspect. (2022) . (7) Ratwani, R. M., Reider, J. & Singh, H. A Decade of Health Information Technology Usability Challenges and the Path Forward. JAMA 321 (8), 743–744 (2019). 10.1001/jama.2019.0161 . (8) Howe, J. L., Adams, K. T., Hettinger, A. Z. & Ratwani, R. M. Electronic Health Record Usability Issues and Potential Contribution to Patient Harm. JAMA 319 (12), 1276–1278 (2018). 10.1001/jama.2018.1171 . (9) Shanafelt, T. D. et al. Relationship between clerical burden and characteristics of the electronic environment with physician burnout and professional satisfaction. Mayo Clin. Proc. 91 (7), 836–848 (2016). 10.1016/j.mayocp.2016.05.007 . (10) Gardner, R. L. et al. Physician stress and burnout: The impact of health information technology. J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019) . (11) Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. 
Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019) . (11) Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 
10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. 
BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. 
Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. 
AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. 
et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. 
Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . 
(19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? 
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . 
(19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? 
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. 
AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. 
et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. 
Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . 
(21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. 
Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. 
Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 
10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. 
& Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? 
(48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. 
W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 
10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . 
(38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . 
(48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. 
W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 
10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 
10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. 
The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. 
G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . 
(35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. 
Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? 
et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. 
A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 
10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . 
(52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. 
10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 
10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. 
J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Mathews, S. C. et al. Digital health: A path to validation. npj Digit. Med. 2 (1), 38 (2019). 10.1038/s41746-019-0111-3 . (5) Müller-Riemenschneider, F. et al. Barriers to Routine Risk-Score Use for Healthy Primary Care Patients: Survey and Qualitative Study. Arch. Intern. Med. 170 (8), 719–724 (2010). 10.1001/archinternmed.2010.66 . (6) Abernethy, A. et al. The promise of digital health: Then, now, and the future. NAM perspect. (2022) . (7) Ratwani, R. M., Reider, J. & Singh, H. A Decade of Health Information Technology Usability Challenges and the Path Forward. JAMA 321 (8), 743–744 (2019). 10.1001/jama.2019.0161 . (8) Howe, J. L., Adams, K. T., Hettinger, A. Z. & Ratwani, R. M. 
Electronic Health Record Usability Issues and Potential Contribution to Patient Harm. JAMA 319 (12), 1276–1278 (2018). 10.1001/jama.2018.1171 . (9) Shanafelt, T. D. et al. Relationship between clerical burden and characteristics of the electronic environment with physician burnout and professional satisfaction. Mayo Clin. Proc. 91 (7), 836–848 (2016). 10.1016/j.mayocp.2016.05.007 . (10) Gardner, R. L. et al. Physician stress and burnout: The impact of health information technology. J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019) . (11) Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. 
BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? 
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Müller-Riemenschneider, F. et al. Barriers to Routine Risk-Score Use for Healthy Primary Care Patients: Survey and Qualitative Study. Arch. Intern. Med. 170 (8), 719–724 (2010). 10.1001/archinternmed.2010.66 . (6) Abernethy, A. et al. The promise of digital health: Then, now, and the future. NAM perspect. (2022) . (7) Ratwani, R. M., Reider, J. & Singh, H. A Decade of Health Information Technology Usability Challenges and the Path Forward. JAMA 321 (8), 743–744 (2019). 10.1001/jama.2019.0161 . (8) Howe, J. L., Adams, K. T., Hettinger, A. Z. & Ratwani, R. M. Electronic Health Record Usability Issues and Potential Contribution to Patient Harm. JAMA 319 (12), 1276–1278 (2018). 10.1001/jama.2018.1171 . (9) Shanafelt, T. D. et al. Relationship between clerical burden and characteristics of the electronic environment with physician burnout and professional satisfaction. Mayo Clin. Proc. 91 (7), 836–848 (2016). 10.1016/j.mayocp.2016.05.007 . (10) Gardner, R. L. et al. Physician stress and burnout: The impact of health information technology. J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019) . (11) Bajwa, J., Munir, U., Nori, A. 
& Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. 
Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 
10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . 
(43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Gardner, R. L. et al. Physician stress and burnout: The impact of health information technology. 
J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019) . (11) Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 
10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. 
BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. 
Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. 
AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. 
et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. 
Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . 
BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . 
(19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? 
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. 
Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . 
(34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. 
AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. 
In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 
10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 
62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). 
URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. 
Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. 
SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/.
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. 
Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. 
J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. 
Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. 
A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. 
Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. 
Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. 
A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/.
10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 
78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. 
Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. 
A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . 
(65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. 
Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. 
mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
M., Reider, J. & Singh, H. A Decade of Health Information Technology Usability Challenges and the Path Forward. JAMA 321 (8), 743–744 (2019). 10.1001/jama.2019.0161 . (8) Howe, J. L., Adams, K. T., Hettinger, A. Z. & Ratwani, R. M. Electronic Health Record Usability Issues and Potential Contribution to Patient Harm. JAMA 319 (12), 1276–1278 (2018). 10.1001/jama.2018.1171 . (9) Shanafelt, T. D. et al. Relationship between clerical burden and characteristics of the electronic environment with physician burnout and professional satisfaction. Mayo Clin. Proc. 91 (7), 836–848 (2016). 10.1016/j.mayocp.2016.05.007 . (10) Gardner, R. L. et al. Physician stress and burnout: The impact of health information technology. J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019) . (11) Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. 
JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 
55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. 
SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Müller-Riemenschneider, F. et al. Barriers to Routine Risk-Score Use for Healthy Primary Care Patients: Survey and Qualitative Study. Arch. Intern. Med. 170 (8), 719–724 (2010). 10.1001/archinternmed.2010.66 . (6) Abernethy, A. et al. The promise of digital health: Then, now, and the future. NAM perspect. (2022) . (7) Ratwani, R. M., Reider, J. & Singh, H. A Decade of Health Information Technology Usability Challenges and the Path Forward. JAMA 321 (8), 743–744 (2019). 10.1001/jama.2019.0161 . (8) Howe, J. L., Adams, K. T., Hettinger, A. Z. & Ratwani, R. M. Electronic Health Record Usability Issues and Potential Contribution to Patient Harm. JAMA 319 (12), 1276–1278 (2018). 10.1001/jama.2018.1171 . (9) Shanafelt, T. D. et al. Relationship between clerical burden and characteristics of the electronic environment with physician burnout and professional satisfaction. Mayo Clin. Proc. 91 (7), 836–848 (2016). 
10.1016/j.mayocp.2016.05.007 . (10) Gardner, R. L. et al. Physician stress and burnout: The impact of health information technology. J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019) . (11) Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. 
Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. 
In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Abernethy, A. et al. The promise of digital health: Then, now, and the future. NAM perspect. (2022) . (7) Ratwani, R. M., Reider, J. & Singh, H. A Decade of Health Information Technology Usability Challenges and the Path Forward. JAMA 321 (8), 743–744 (2019). 10.1001/jama.2019.0161 . (8) Howe, J. L., Adams, K. T., Hettinger, A. Z. & Ratwani, R. M. Electronic Health Record Usability Issues and Potential Contribution to Patient Harm. JAMA 319 (12), 1276–1278 (2018). 10.1001/jama.2018.1171 . (9) Shanafelt, T. D. et al. Relationship between clerical burden and characteristics of the electronic environment with physician burnout and professional satisfaction. Mayo Clin. Proc. 91 (7), 836–848 (2016). 10.1016/j.mayocp.2016.05.007 . (10) Gardner, R. L. et al. Physician stress and burnout: The impact of health information technology. J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019) . (11) Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. 
Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. 
Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 
10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . 
(38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . 
(48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. 
W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. 
Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. 
Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
(37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. 
et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. 
Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 
10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. 
& Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? 
BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. 
Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. 
The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 
80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 
7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . 
(57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. 
G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . 
(34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. 
Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. 
Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . 
(52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. 
& van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
(39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Abernethy, A. et al. The promise of digital health: Then, now, and the future. NAM perspect. (2022) . (7) Ratwani, R. M., Reider, J. & Singh, H. A Decade of Health Information Technology Usability Challenges and the Path Forward. JAMA 321 (8), 743–744 (2019). 10.1001/jama.2019.0161 . (8) Howe, J. L., Adams, K. T., Hettinger, A. Z. & Ratwani, R. M. Electronic Health Record Usability Issues and Potential Contribution to Patient Harm. JAMA 319 (12), 1276–1278 (2018). 10.1001/jama.2018.1171 . (9) Shanafelt, T. D. et al. Relationship between clerical burden and characteristics of the electronic environment with physician burnout and professional satisfaction. Mayo Clin. Proc. 91 (7), 836–848 (2016). 10.1016/j.mayocp.2016.05.007 . (10) Gardner, R. L. et al. Physician stress and burnout: The impact of health information technology. J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019) . (11) Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. 
AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. 
Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. 
Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. 
et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Ratwani, R. M., Reider, J. & Singh, H. 
A Decade of Health Information Technology Usability Challenges and the Path Forward. JAMA 321 (8), 743–744 (2019). 10.1001/jama.2019.0161 . (8) Howe, J. L., Adams, K. T., Hettinger, A. Z. & Ratwani, R. M. Electronic Health Record Usability Issues and Potential Contribution to Patient Harm. JAMA 319 (12), 1276–1278 (2018). 10.1001/jama.2018.1171 . (9) Shanafelt, T. D. et al. Relationship between clerical burden and characteristics of the electronic environment with physician burnout and professional satisfaction. Mayo Clin. Proc. 91 (7), 836–848 (2016). 10.1016/j.mayocp.2016.05.007 . (10) Gardner, R. L. et al. Physician stress and burnout: The impact of health information technology. J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019) . (11) Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 
10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . 
(28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 
42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 
10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. 
Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. 
Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 
10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . 
(38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . 
(48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. 
W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. 
Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. 
Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. 
Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . 
(39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. 
& McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . 
(37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. 
et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. 
Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 
10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. 
Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 
10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . 
(53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). 
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 
80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 
7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . 
(57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. 
G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . 
(34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. 
Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. 
Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 
162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. 
The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. 
& A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. 
SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. 
& van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. 
arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. 
Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. 
mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . 
  5. Müller-Riemenschneider, F. et al. Barriers to Routine Risk-Score Use for Healthy Primary Care Patients: Survey and Qualitative Study. Arch. Intern. Med. 170 (8), 719–724 (2010). 10.1001/archinternmed.2010.66.
AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. 
et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. 
Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Shanafelt, T. D. et al. Relationship between clerical burden and characteristics of the electronic environment with physician burnout and professional satisfaction. Mayo Clin. Proc. 91 (7), 836–848 (2016). 10.1016/j.mayocp.2016.05.007 . (10) Gardner, R. L. et al. Physician stress and burnout: The impact of health information technology. J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019) . (11) Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. 
Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Gardner, R. L. et al. Physician stress and burnout: The impact of health information technology. J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019) . (11) Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? 
(2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 
10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. 
Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. 
Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. 
Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . 
(39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . 
(21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. 
Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. 
Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 
10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . 
(26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 
10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022).
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. 
Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 
10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . 
(53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). 
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 
80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 
7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . 
(62) Yao, S. et al.
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 
7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . 
(57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. 
Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Shanafelt, T. D. et al. Relationship between clerical burden and characteristics of the electronic environment with physician burnout and professional satisfaction. Mayo Clin. Proc. 91 (7), 836–848 (2016). 10.1016/j.mayocp.2016.05.007 . (10) Gardner, R. L. et al. Physician stress and burnout: The impact of health information technology. J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019) . (11) Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. 
(15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. 
Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 
10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Gardner, R. L. et al. Physician stress and burnout: The impact of health information technology. J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019) . (11) Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 
28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. 
Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 
10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Bajwa, J., Munir, U., Nori, A. & Williams, B. 
Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. 
Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. 
AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. 
et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. 
Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. 
Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . 
(34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Hippisley-Cox, J., Coupland, C. & Brindle, P. 
Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. 
Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . 
10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. 
Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Singhal, K. et al. 
Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. 
G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. 
Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. 
Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. 
Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 
162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. 
The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? 
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. 
G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 
5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. 
Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). 
URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. 
Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. 
Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
(38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . 
(48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. 
W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. 
Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. 
Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. 
Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . 
(21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. 
Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. 
Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 
10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. 
Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . 
(39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. 
& McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. 
Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 
10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . 
(53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). 
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. 
Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 
10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . 
(53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). 
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 
10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 
7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . 
(57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . 
(63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. 
Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 
10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Bajwa, J., Munir, U., Nori, A. & Williams, B. 
Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. 
Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 
10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . 
(43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . 
(53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). 
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. 
Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 
10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. 
Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . 
(34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Hippisley-Cox, J., Coupland, C. & Brindle, P. 
Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. 
Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . 
(39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. 
& Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? 
BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. 
Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. 
Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. 
Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 
162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. 
The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? 
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. 
Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. 
Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. 
& van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 
30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
(38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . 
(48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. 
W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Bajwa, J., Munir, U., Nori, A. & Williams, B. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. 
Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. 
Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. 
Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . 
(21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. 
Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. 
Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 
10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. 
Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . 
(39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. 
& McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . 
(37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. 
et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. 
Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 
10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. 
& Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? 
BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. 
Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. 
The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . 
(57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. 
G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . 
(34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. 
Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. 
Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 
162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . 
(52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. 
& van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. 
  10. Gardner, R. L. et al. Physician stress and burnout: The impact of health information technology. J. Am. Med. Inform. Assoc. 26 (2), 106–114 (2019).
Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. 
Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . 
(39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . 
(19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? 
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. 
AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. 
et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. 
Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. 
Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 
10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . 
(38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . 
(48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. 
W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 
10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 
62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). 
URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. 
Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. 
Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 
10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 
10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. 
The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. 
Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. 
A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. 
Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. 
Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . 
(47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. 
et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. 
A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/.
  11. Artificial intelligence in healthcare: Transforming the practice of medicine. Future Health. J. 8 (2), e188–e194 (2021). 10.7861/fhj.2021-0095 . (12) Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. 
Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 
10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . 
(43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . 
(53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). 
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rajpurkar, P., Chen, O., Emmaand Banerjee & Topol, E. J. AI in health and medicine. Nat. Med. 28 (1), 31–38 (2022). 10.1038/s41591-021-01614-0 . (13) Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. 
Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 
10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . 
(43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . 
(53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). 
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020) . (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . 
(22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 
10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 
62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). 
URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. 
Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/. (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. 
Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. 
G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Vaduganathan, M., Mensah, G. A., Turco, J. V., Fuster, V. & Roth, G. A. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein B and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/.
(39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. 
& Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? 
BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. 
& Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. 
et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . 
(44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Food and Drug Administration and others. 
Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 
10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. 
Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 
162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. 
The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? 
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. 
G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 
5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. 
Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. 
Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. 
& van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 
10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . 
(38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . 
(48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. 
W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. 
J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. 
In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. 
Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . 
(34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
(53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). 
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. 
& Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 
10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. 
Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. 
Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. 
& van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
  13. Asan, O., Bayrak, A. E. & Choudhury, A. Artificial intelligence and human trust in healthcare: Focus on clinicians. J. Med. Internet Res. 22 (6), e15154 (2020).
      (14) Goldfarb, A. & Teodoridis, F. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/.
      (15) Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94.
      (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2.
      (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting stroke: Results from the National Registry of Atrial Fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864.
      (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099.
      (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276.
      (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011.
      (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019).
      (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021).
      (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4.
      (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2.
      (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003.
      (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173.
      (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730.
      (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168.
      (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023).
      (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022).
      (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409.
      (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2.
      (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y.
      (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022).
      (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: What, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375.
      (36) Vaduganathan, M., Mensah, G. A., Turco, J. V., Fuster, V. & Roth, G. A. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005.
      (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579.
      (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: New models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309.
      (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779.
      (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958).
      (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231.
      (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361.
      (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302.
      (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698.
      (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017).
      (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083.
      (47) Behbodikhah, J. et al. Apolipoprotein B and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690.
      (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080.
      (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653.
      (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021).
      (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023).
      (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2.
      (53) National Institute for Health and Care Excellence. Cardiovascular disease: Risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023.
      (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022).
      (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144.
      (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287.
      (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154.
      (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950).
      (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03.
      (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018).
      (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001.
      (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023).
      (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23.
      (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023).
      (65) Streamlit. https://streamlit.io/.
In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. 
Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . 
(34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Hippisley-Cox, J., Coupland, C. & Brindle, P. 
Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. 
Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . 
(39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. 
G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. 
G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 
5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. 
Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). 
URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. 
Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. 
Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. 
PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
  14. Why is AI adoption in health care lagging? (2022). URL https://www.brookings.edu/articles/why-is-ai-adoption-in-health-care-lagging/.
  15. Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). doi:10.7861/futurehosp.6-2-94.
  16. Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). doi:10.1186/s12916-019-1426-2.
  17. Gage, B. F. et al. Validation of clinical classification schemes for predicting stroke: Results from the National Registry of Atrial Fibrillation. JAMA 285 (22), 2864–2870 (2001). doi:10.1001/jama.285.22.2864.
  18. Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). doi:10.1136/bmj.j2099.
  19. Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). doi:10.1371/journal.pdig.0000276.
  20. Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). doi:10.1016/j.jbi.2013.06.011.
  21. Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019).
  22. Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021).
  23. Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). doi:10.1038/s41586-023-05881-4.
  24. Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). doi:10.1038/s41586-023-06291-2.
  25. Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). doi:10.1016/j.diii.2023.02.003.
  26. Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). doi:10.18653/v1/2020.acl-main.173.
  27. Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). doi:10.1145/3571730.
  28. Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). doi:10.18653/v1/2021.naacl-main.168.
  29. Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023).
  30. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022).
  31. Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). doi:10.1093/bib/bbac409.
  32. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). doi:10.1038/s41746-022-00742-2.
  33. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). doi:10.1038/s41586-023-06160-y.
  34. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022).
  35. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: What, why, and how? BMJ 338, b375 (2009). doi:10.1136/bmj.b375.
  36. Vaduganathan, M., Mensah, G. A., Turco, J. V., Fuster, V. & Roth, G. A. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). doi:10.1016/j.jacc.2022.11.005.
  37. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). doi:10.1161/CIRCULATIONAHA.107.699579.
  38. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: New models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). doi:10.1093/eurheartj/ehab309.
  39. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). doi:10.1371/journal.pmed.1001779.
  40. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958).
  41. Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). doi:10.1136/heartjnl-2022-321231.
  42. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). doi:10.1177/0272989X06295361.
  43. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). doi:10.1198/000313008X370302.
  44. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). doi:10.7326/M14-0698.
  45. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017).
  46. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). doi:10.1001/jamacardio.2021.5083.
  47. Behbodikhah, J. et al. Apolipoprotein B and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). doi:10.3390/metabo11100690.
  48. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). doi:10.1016/j.jacc.2009.10.080.
  49. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). doi:10.1371/journal.pone.0213653.
  50. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021).
  51. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023).
  52. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). doi:10.1038/s42256-023-00698-2.
  53. National Institute for Health and Care Excellence. Cardiovascular disease: Risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023.
  54. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022).
  55. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). doi:10.1145/3571884.3597144.
  56. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Med. 20 (10), e1004287 (2023). doi:10.1371/journal.pmed.1004287.
  57. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). doi:10.1002/sim.4154.
  58. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950).
  59. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). doi:10.18637/jss.v045.i03.
  60. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018).
  61. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). doi:10.1016/j.jacc.2010.09.001.
  62. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023).
  63. Chase, H. LangChain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23.
  64. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023).
  65. Streamlit. https://streamlit.io/.
17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. 
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. 
Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. 
AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. 
et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. 
Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Gage, B. F. et al. Validation of clinical classification schemes for predicting strokeresults from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. 
Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . 
(34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
(53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). 
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . 
(33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 
10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. 
Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Singhal, K. et al. 
Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. 
G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. 
Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023). (41) Parsons, R. E. et al.
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. 
& Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 
10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. 
PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Nakano, R. et al. 
WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 
30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
  15. Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Health. J. 6 (2), 94–98 (2019). 10.7861/futurehosp.6-2-94 . (16) Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2 . (17) Gage, B. F. et al. Validation of clinical classification schemes for predicting stroke: Results from the national registry of atrial fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864 . (18) Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099 . (19) Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. 
Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 
10.1136/bmj.b375 . (36) Vaduganathan, M., Mensah, G. A., Turco, J. V., Fuster, V. & Roth, G. A. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein B and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/.
(39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276 . (20) Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. 
& Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? 
BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. 
& Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. 
et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . 
(44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Food and Drug Administration and others. 
Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 
10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. 
Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 
162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. 
The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? 
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. 
G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 
5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. 
Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. 
Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. 
& van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
  16. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 (1), 195 (2019). 10.1186/s12916-019-1426-2.
BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. 
Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011 . (21) Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? 
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. 
Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. 
SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 
10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. 
Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 
162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. 
The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. 
In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 
10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 
10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. 
The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. 
Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. 
A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. 
Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. 
Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . 
(47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. 
et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. 
A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/.
  17. Gage, B. F. et al. Validation of clinical classification schemes for predicting stroke: Results from the National Registry of Atrial Fibrillation. JAMA 285 (22), 2864–2870 (2001). 10.1001/jama.285.22.2864.
  18. Hippisley-Cox, J., Coupland, C. & Brindle, P. Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: Prospective cohort study. BMJ 357 (2017). 10.1136/bmj.j2099.
  19. Imrie, F., Cebere, B., McKinney, E. F. & van der Schaar, M. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276.
  20. Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011.
  21. Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019).
  22. Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021).
  23. Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4.
  24. Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2.
  25. Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003.
  26. Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173.
  27. Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730.
  28. Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168.
  29. Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023).
  30. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022).
  31. Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409.
  32. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2.
  33. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y.
  34. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022).
  35. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375.
  36. Vaduganathan, M., Mensah, G. A., Turco, J. V., Fuster, V. & Roth, G. A. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005.
  37. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579.
  38. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309.
  39. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779.
  40. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958).
  41. Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231.
  42. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361.
  43. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302.
  44. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698.
  45. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017).
  46. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083.
  47. Behbodikhah, J. et al. Apolipoprotein B and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690.
  48. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080.
  49. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653.
  50. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021).
  51. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023).
  52. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2.
  53. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023.
  54. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022).
  55. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144.
  56. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Med. 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287.
  57. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154.
  58. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950).
  59. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03.
  60. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018).
  61. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001.
  62. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023).
  63. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23.
  64. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023).
  65. Streamlit. https://streamlit.io/.
10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. 
& Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? 
BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. 
Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. 
The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . 
(57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. 
G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . 
(34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. 
Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. 
Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 
162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. 
The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. 
& A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. 
SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 
10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 
7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . 
(57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. 
mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . 
(63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
(44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Food and Drug Administration and others. 
Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 
10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . 
(43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . 
(53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). 
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. 
Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 
10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . 
(53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). 
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . 
(33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 
10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. 
G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 
5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. 
Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). 
URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. 
Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. 
Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. 
& Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. 
PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080.
  19. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digit. Health 2 (6), 1–21 (2023). 10.1371/journal.pdig.0000276.
  20. Rothman, M. J., Rothman, S. I. & Beals, J. Development and validation of a continuous measure of patient condition using the electronic medical record. J. Biomed. Inform. 46 (5), 837–848 (2013). 10.1016/j.jbi.2013.06.011.
  21. Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019).
  22. Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021).
  23. Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4.
  24. Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2.
  25. Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003.
  26. Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173.
  27. Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730.
  28. Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168.
  29. Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023).
  30. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022).
  31. Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409.
  32. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2.
  33. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y.
  34. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022).
  35. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375.
  36. Vaduganathan, M., Mensah, G. A., Turco, J. V., Fuster, V. & Roth, G. A. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005.
  37. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579.
  38. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309.
  39. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779.
  40. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958).
  41. Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231.
  42. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361.
  43. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302.
  44. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698.
  45. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017).
  46. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083.
  47. Behbodikhah, J. et al. Apolipoprotein B and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690.
  48. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080.
  49. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653.
  50. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021).
  51. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023).
  52. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2.
  53. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023.
  54. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022).
  55. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144.
  56. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287.
  57. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154.
  58. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950).
  59. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03.
  60. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018).
  61. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001.
  62. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023).
  63. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23.
  64. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023).
  65. Streamlit. https://streamlit.io/.
(48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. 
W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Food and Drug Administration and others. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) (2019) . (22) Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. 
On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 
10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . 
(38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . 
(48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. 
Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. 
Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 
10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 
10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. 
The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. 
G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . 
(35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. 
Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. 
et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. 
A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 
10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . 
(52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Mourby, M., Ó Cathaoir, K. & Collin, C. B. Transparency of machine-learning in healthcare: The GDPR & European health law. Comput. Law Secur. Rev. 43, 105611 (2021) . (23) Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. 
Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . 
(39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616 (7956), 259–265 (2023). 10.1038/s41586-023-05881-4 . (24) Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? 
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. 
Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. 
et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. 
The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. 
G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . 
(35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. 
Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? 
BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. 
Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 
10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . 
(52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. 
arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 
10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 
78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lecler, A., Duron, L. & Soyer, P. 
Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 
10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. 
Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . 
(39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. 
Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. 
et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Patel, A., Bhattamishra, S. & Goyal, N. 
Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168
(29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023).
(30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022).
(31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409
(32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2
(33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y
(34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022).
(35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375
(36) Vaduganathan, M., Mensah, G. A., Turco, J. V., Fuster, V. & Roth, G. A. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005
(37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579
(38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309
(39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779
(40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958).
(41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231
(42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361
(43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302
(44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698
(45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017).
(46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083
(47) Behbodikhah, J. et al. Apolipoprotein B and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690
(48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080
(49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653
(50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021).
(51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023).
(52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2
(53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023.
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022).
(55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144
(56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Med. 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287
(57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154
(58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950).
(59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw.
45 (3), 1–67 (2011). 10.18637/jss.v045.i03
(60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018).
(61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001
(62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023).
(63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23.
(64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023).
(65) Streamlit. https://streamlit.io/
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 
10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. 
Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. 
Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. 
& van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Nakano, R. et al. 
WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 
30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 
162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. 
The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2 . (25) Lecler, A., Duron, L. & Soyer, P. 
Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 
10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lecler, A., Duron, L. & Soyer, P. Revolutionizing radiology with GPT-based models: Current applications, future possibilities and limitations of ChatGPT. Diagn. Interv. Imaging 104 (6), 269–274 (2023). 10.1016/j.diii.2023.02.003 . (26) Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 
10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . 
(38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . 
(48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. 
W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Maynez, J., Narayan, S., Bohnet, B. & McDonald, R. On faithfulness and factuality in abstractive summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1906–1919 (2020). 10.18653/v1/2020.acl-main.173 . (27) Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. 
In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. 
& Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 
10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. 
Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Nakano, R. et al. 
WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 
30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? 
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. 
G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 
5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. 
Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). 
URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. 
Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. 
Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
  24. Singhal, K. et al. Large language models encode clinical knowledge. Nature (2023). 10.1038/s41586-023-06291-2.
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. 
G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 
5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. 
Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). 
URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. 
Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. 
Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. 
PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
(39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55 (12) (2023). 10.1145/3571730 . (28) Patel, A., Bhattamishra, S. & Goyal, N. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. 
Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. 
et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Patel, A., Bhattamishra, S. & Goyal, N. 
Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168 . (29) Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . 
(39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023) . (30) Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. 
G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 
Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. 
Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. 
& van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
(53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). 
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. 
& Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. 
SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. 
Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. 
General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. 
Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 
30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. 
et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. 
PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022) . (31) Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 
10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 
10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409 . (32) Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. 
The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. 
G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . 
(35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. 
Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? 
BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. 
Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. 
J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. 
Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. 
A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. 
Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. 
78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. 
Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. 
A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . 
(65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. 
Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. 
mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
  28. Are NLP models really able to solve simple math word problems? In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2080–2094 (2021). 10.18653/v1/2021.naacl-main.168.
  29. Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023).
  30. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022).
  31. Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409.
  32. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2.
  33. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y.
  34. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022).
  35. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375.
  36. Vaduganathan, M., Mensah, G. A., Turco, J. V., Fuster, V. & Roth, G. A. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005.
  37. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579.
  38. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309.
  39. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779.
  40. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958).
  41. Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231.
  42. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361.
  43. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302.
  44. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698.
  45. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017).
  46. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083.
  47. Behbodikhah, J. et al. Apolipoprotein B and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690.
  48. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080.
  49. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653.
  50. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021).
  51. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023).
  52. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2.
  53. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023.
  54. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022).
  55. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144.
  56. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Med. 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287.
  57. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154.
  58. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950).
  59. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03.
  60. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018).
  61. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001.
  62. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023).
  63. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23.
  64. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023).
  65. Streamlit. https://streamlit.io/.
5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. 
Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). 
URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. 
Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. 
Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. 
& van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
  29. Schick, T. et al. Toolformer: Language models can teach themselves to use tools. Adv. Neural Inf. Process. Syst. 36 (2023).
  30. Komeili, M., Shuster, K. & Weston, J. Internet-augmented dialogue generation. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 8460–8478 (2022).
  31. Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409.
  32. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2.
  33. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y.
  34. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022).
  35. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375.
  36. Vaduganathan, M., Mensah, G. A., Turco, J. V., Fuster, V. & Roth, G. A. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005.
  37. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579.
  38. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309.
  39. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779.
  40. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958).
  41. Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231.
  42. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361.
  43. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302.
  44. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698.
  45. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017).
  46. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083.
  47. Behbodikhah, J. et al. Apolipoprotein B and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690.
  48. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080.
  49. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653.
  50. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021).
  51. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023).
  52. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2.
  53. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023.
  54. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022).
  55. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144.
  56. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287.
  57. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154.
  58. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950).
  59. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03.
  60. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018).
  61. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001.
  62. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023).
  63. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23.
  64. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023).
  65. Streamlit. https://streamlit.io/.
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2 . (33) Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. 
G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y . (34) Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . 
(35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. 
Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? 
BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. 
Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. 
mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. 
PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. 
Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. 
Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. 
Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. 
UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. 
M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. 
& Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 
30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
  31. Luo, R. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief. Bioinformatics 23 (6), bbac409 (2022). 10.1093/bib/bbac409.
  32. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2.
  33. Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature (2023). 10.1038/s41586-023-06160-y.
  34. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022).
  35. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375.
  36. Vaduganathan, M., Mensah, G. A., Turco, J. V., Fuster, V. & Roth, G. A. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005.
  37. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579.
  38. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309.
  39. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779.
  40. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958).
  41. Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231.
  42. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361.
  43. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302.
  44. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698.
  45. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017).
  46. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083.
  47. Behbodikhah, J. et al. Apolipoprotein B and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690.
  48. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080.
  49. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653.
  50. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021).
  51. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023).
  52. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2.
  53. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023.
  54. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022).
  55. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144.
  56. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287.
  57. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154.
  58. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950).
  59. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03.
  60. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018).
  61. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001.
  62. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023).
  63. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23.
  64. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023).
  65. Streamlit. https://streamlit.io/.
& Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lakkaraju, H., Slack, D., Chen, Y., Tan, C. & Singh, S. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022) . (35) Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G., Royston, P., Vergouwe, Y., Grobbee, D. E. & Altman, D. G. Prognosis and prognostic research: what, why, and how? BMJ 338, b375 (2009). 10.1136/bmj.b375 . (36) Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 
10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. 
Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. 
Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. 
& van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/.
  32. Yang, X. et al. A large language model for electronic health records. npj Digit. Med. 5 (1), 194 (2022). 10.1038/s41746-022-00742-2.
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. 
& van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. 
arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. 
Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. 
mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 
7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . 
(57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. 
Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . 
  34. Rethinking explainability as a dialogue: A practitioner’s perspective. arXiv preprint arXiv:2202.01875 (2022).
& A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Muthiah, V., A., M. G., Varieur, T. J., Valentin, F. & A., R. G. The global burden of cardiovascular diseases and risk. J. Am. Coll. Cardiol. 80 (25), 2361–2371 (2022). 10.1016/j.jacc.2022.11.005 . (37) D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. 
SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579 . (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 
10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. 
Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. 
mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . 
(63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309 . (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779 . (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958) . (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 
7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . 
(57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. 
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. 
Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. 
Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 
10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. 
WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . 
(52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. 
& van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154. (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950). (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03. (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018). (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001. (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023). (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023). (65) Streamlit. https://streamlit.io/.
  37. D’Agostino, R. B. et al. General cardiovascular risk profile for use in primary care. Circulation 117 (6), 743–753 (2008). 10.1161/CIRCULATIONAHA.107.699579. (38) SCORE2 working group & ESC Cardiovascular risk collaboration. SCORE2 risk prediction algorithms: new models to estimate 10-year risk of cardiovascular disease in Europe. Eur. Heart J. 42 (25), 2439–2454 (2021). 10.1093/eurheartj/ehab309. (39) Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779. (40) Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53 (282), 457–481 (1958). (41) Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231. (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361. (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302. (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698. (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017). (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083. (47) Behbodikhah, J.
et al. Apolipoprotein B and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690. (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080. (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653. (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021). (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023). (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2. (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022). (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144. (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287. (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J.
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154. (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950). (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03. (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018). (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001. (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023). (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023). (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. 
Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. 
arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. 
WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 
30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Parsons, R. E. et al. Independent external validation of the QRISK3 cardiovascular disease risk prediction model using UK Biobank. Heart 109 (22), 1690–1697 (2023). 10.1136/heartjnl-2022-321231 . (42) Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. 
et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. 
A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 
10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
  39. Sudlow, C. et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Med. 12 (3), 1–10 (2015). 10.1371/journal.pmed.1001779.
Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. 
Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. 
A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. & Elkin, E. B. Decision curve analysis: A novel method for evaluating prediction models. Med. Decis. Making 26 (6), 565–574 (2006). 10.1177/0272989X06295361 . (43) Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. 
arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 
10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Vickers, A. J. Decision analysis for the evaluation of diagnostic tests, prediction models, and molecular markers. Am. Stat. 62 (4), 314–320 (2008). 10.1198/000313008X370302 . (44) Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. 
WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 
10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . 
(52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Behbodikhah, J. et al. 
Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698 . (45) Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. 
Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017) . (46) Marston, N. A. et al. 
Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 
10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . 
(52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. 
arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 
10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 
PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Erqou, S. et al. 
Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. 
(54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Nakano, R. et al. 
WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 
30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. 
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. 
Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. 
Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. 
mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 
45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . 
(63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. 
Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. 
Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. 
Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 
10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . 
(55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 
5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. 
ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. 
In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. 
(64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. 
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). 
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
  44. Moons, K. G. et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162 (1), W1–W73 (2015). 10.7326/M14-0698.
  45. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30, 4765–4774 (2017).
  46. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083.
  47. Behbodikhah, J. et al. Apolipoprotein B and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690.
  48. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080.
  49. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653.
  50. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021).
  51. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023).
  52. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2.
  53. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023.
  54. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022).
  55. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144.
  56. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287.
  57. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154.
  58. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950).
  59. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03.
  60. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018).
  61. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001.
  62. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023).
  63. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23.
  64. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023).
  65. Streamlit. https://streamlit.io/.
In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083 . (47) Behbodikhah, J. 
et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690 . (48) Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. 
WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080 . (49) Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. 
The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. 
PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653 . (50) Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021) . (51) OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. 
On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023) . (52) Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . (53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. 
The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2 . 
(53) National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). 
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023. (54) Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. 
In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022) . (55) Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144 . (56) Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . 
(59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. 
arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
  46. Marston, N. A. et al. Association of apolipoprotein B–containing lipoproteins and risk of myocardial infarction in individuals with and without atherosclerosis: Distinguishing between particle concentration, type, and content. JAMA Cardiol. 7 (3), 250–256 (2022). 10.1001/jamacardio.2021.5083.
Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). 
Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
  47. Behbodikhah, J. et al. Apolipoprotein b and cardiovascular disease: Biomarker and potential therapeutic target. Metabolites 11 (10), 690 (2021). 10.3390/metabo11100690.
  48. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080.
  49. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. H. F. & van der Schaar, M. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653.
  50. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021).
  51. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023).
  52. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2.
  53. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023.
  54. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022).
  55. Skjuve, M., Følstad, A. & Brandtzaeg, P. B. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144.
  56. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287.
  57. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154.
  58. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950).
  59. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03.
  60. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018).
  61. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001.
  62. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023).
  63. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23.
  64. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023).
  65. Streamlit. https://streamlit.io/.
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
  48. Erqou, S. et al. Apolipoprotein(a) isoforms and the risk of vascular disease: Systematic review of 40 studies involving 58,000 participants. J. Am. Coll. Cardiol. 55 (19), 2160–2167 (2010). 10.1016/j.jacc.2009.10.080.
  49. Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLoS One 14 (5), 1–17 (2019). 10.1371/journal.pone.0213653.
  50. Nakano, R. et al. WebGPT: Browser-assisted question-answering with human feedback. arXiv preprint arXiv:2112.09332 (2021).
10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. 
https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
  51. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023).
  52. Imrie, F., Davis, R. & van der Schaar, M. Multiple stakeholders drive diverse interpretability requirements for machine learning in healthcare. Nat. Mach. Intell. 5 (8), 824–829 (2023). 10.1038/s42256-023-00698-2.
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287 . (57) Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . 
(63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154 . (58) Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950) . (59) van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 
2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03 . (60) Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 
56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
  53. National Institute for Health and Care Excellence. Cardiovascular disease: risk assessment and reduction, including lipid modification (2014). URL https://www.nice.org.uk/guidance/cg181. Last updated: 24 May 2023.
  54. Taylor, R. et al. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085 (2022).
  55. The user experience of ChatGPT: Findings from a questionnaire study of early users. In: Proceedings of the 5th International Conference on Conversational User Interfaces (2023). 10.1145/3571884.3597144.
  56. Callender, T. et al. Assessing eligibility for lung cancer screening using parsimonious ensemble machine learning models: A development and validation study. PLOS Medicine 20 (10), e1004287 (2023). 10.1371/journal.pmed.1004287.
  57. Uno, H., Cai, T., Pencina, M. J., D’Agostino, R. B. & Wei, L. J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 30 (10), 1105–1117 (2011). 10.1002/sim.4154.
  58. Brier, G. W. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78 (1), 1–3 (1950).
  59. van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45 (3), 1–67 (2011). 10.18637/jss.v045.i03.
  60. Rubin, D. B. Multiple imputation. Flexible Imputation of Missing Data, Second Edition 29–62 (2018) . (61) Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. 
(64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
  61. Greenland, P. et al. 2010 ACCF/AHA guideline for assessment of cardiovascular risk in asymptomatic adults: A report of the American College of Cardiology Foundation/American Heart Association Task Force on practice guidelines. J. Am. Coll. Cardiol. 56 (25), e50–e103 (2010). 10.1016/j.jacc.2010.09.001 . (62) Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
  62. Yao, S. et al. ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations (2023) . (63) Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
  63. Chase, H. Langchain. https://github.com/hwchase17/langchain (2022). Last accessed: 2023-06-23. (64) Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
  64. Dong, Q. et al. A survey on in-context learning. arXiv preprint arXiv:2301.00234 (2023) . (65) Streamlit. https://streamlit.io/. Streamlit. https://streamlit.io/.
  65. Streamlit. https://streamlit.io/.